Abstract. Information spread in social media depends on a number of
factors, including how the site displays information, how users navigate 
it to find items of interest, users’ tastes, and the ‘vitality’ of information, 
i.e., its propensity to be adopted, or retweeted, upon exposure. Probabilistic
models can learn users’ tastes from the history of their item
adoptions and recommend new items to users. However, current models 
ignore cognitive biases that are known to affect behavior. Specifically, 
people pay more attention to items at the top of a list than those in 
lower positions. As a consequence, items near the top of a user’s social 
media stream have higher visibility, and are more likely to be seen and 
adopted, than those appearing below. Another bias is due to the item’s 
fitness: some items have a high propensity to spread upon exposure regardless
of the interests of adopting users. We propose a probabilistic
model that incorporates human cognitive biases and personal relevance 
in the generative model of information spread. We use the model to predict
how messages containing URLs spread on Twitter. Our work shows
that models of user behavior that account for cognitive factors can better 
describe and predict user behavior in social media. 

Keywords: social media, information diffusion, cognitive factors 


1 Introduction 

Online social networks can dramatically amplify the spread of information by 
allowing users to forward information to their followers, and those to their own 
followers, and so on. Predicting how people will respond to information is of 
immense practical and commercial interest. Prediction can guide the design of 
more effective marketing and public awareness campaigns, for example, those
announcing the locations of clinics dispensing the flu vaccine. Researchers believe
that information spread in social media is a complex process that depends on
the nature of information [21], the structure of the network, the strength
of social influences, as well as user interests and topic preferences.
These factors are thought to render information diffusion in social media
unpredictable, although researchers have identified some features that weakly
correlate with the size of information cascades. On the other hand, more
progress has been made addressing information spread in social media as a
social recommendation problem. In this case, probabilistic models are used to learn
users’ topic preferences from the history of their item adoptions and predict what
items in their social media stream users will adopt [17, 23, 14].


Existing models of social recommendation largely ignore cognitive factors 
of user behavior. One such aspect is position bias. Due to this cognitive bias, 
the amount of attention an item receives strongly depends on its position on 
a screen or within a list of items. Position bias is known to affect the answers
people select in response to multiple-choice questions, where on the screen
they look, and the links on a web page they choose to follow. Also
as a consequence of position bias, items near the top of a user’s social media 
stream are more salient, and therefore, more likely to be viewed, than items in 
lower positions [16]. 


To distinguish position-based salience from other psychological effects, we 
refer to it as an item’s visibility. After viewing the item, the user may decide 
to adopt it. She adopts information either because it is personally relevant to 
her or because it is generally interesting. To handle the former case, the model 
must include a hidden topic space which can be used to compute the relevance 
of items to users. The user may also adopt an item that is not strictly relevant, 
but interesting nonetheless. Such items are usually viral memes, such as breaking
news, that have a high fitness, i.e., propensity to spread upon exposure regardless 
of the interests of adopting users. 


In Section 2 we introduce a conceptually simple model of information spread
that captures the factors important to information spread: an item’s visibility,
fitness, and its personal relevance to the user’s interests. An item’s visibility depends
on its position in the user’s social media stream. However, since position data is 
often not directly available, we estimate visibility from user’s information load. 
This quantity measures the number of items a user has to inspect before finding 
a specific item to adopt, and it is given by the number of new messages arriving 
in the user’s stream and the frequency with which the user visits the stream. The greater
the number of new messages in the stream, either because the user follows
more people or because she rarely visits her stream, the less visible any particular
item is. Accounting for visibility allows us to learn a better model of the user’s
interests from the history of her item adoptions. When the user does not adopt 
an item, the model allows us to discriminate between lack of interest and failure 
to see the item. While this simple model ignores some of the nuances of information
spread, it has very high predictive power. In Section 3 we evaluate the
proposed model on a social recommendation task using Twitter. We study the
impact of visibility, item fitness, and personal relevance on information diffusion 
in Twitter in aggregate and through illustrative individual examples. Our study 
demonstrates that models of user behavior that account for cognitive biases can 
better describe and predict user behavior in social media, and that information 
spread is more predictable than previously thought. 

Fig. 1. The Vip model with user topic ($u$) and item topic ($\theta$) profiles, personal relevance
of an item to user ($\delta$), visibility to user ($v$), item fitness ($\eta$), expected number of new
posts user received ($\rho$) and item adoption ($r$). $N$ is the number of users and $M$ is the
number of items.


2 The VIP Model 


We describe Vip, a model that captures the three basic ingredients of information 
spread in social media: item’s fitness and its visibility and personal relevance to 
the user. Vip is based on social recommendation models, whose goal is to recommend
only the relevant items to users [17, 23, 14]. In social recommendation,
each user is assigned a vector of topics, which serve as her interest profile, and 
each item also has some topics. Once these hidden vectors are learned from 
the history of user item adoptions, it is possible to calculate an item’s personal 
relevance to the user. 

Social media users adopt items even if they had not earlier demonstrated 
a sustained interest in their topics. This is often the case with viral, general- 
interest items, such as breaking news or celebrity gossip. We use the term fitness 
(or ‘virality’) to describe an item’s propensity to be adopted upon exposure. 

The key innovation of Vip is to introduce visibility into the generative model 
of item adoption. Visibility conceptually simplifies the mechanisms of information
spread and explains away some of the complexity associated with it, for
example, the network effects observed in prior studies. Visibility explicitly takes
into account the process of information discovery in social media. Online social 
networks are directed, with users following the activities of their friends. A user’s 
message stream contains a list of items her friends adopted or “recommended” 
to her, chronologically ordered by their adoption time, with the most recent item 
at the top of the stream. We consider a user to be exposed as soon as the item 
enters her stream; however, exposure does not guarantee that the user will actually
view the item. The probability of viewing, which we call visibility, depends on the
item’s position in the user’s stream. Due to a cognitive bias known as position
bias, a user is more likely to attend to items near the top of the screen
than those deeper in the stream [3]. Below we discuss a method to quantitatively
account for this visibility. 
Figure 1 graphically represents the Vip model. It considers a user i with a
user-topic vector $u_i$ and an item j with an item-topic vector $\theta_j$. Vip generates an
adoption of an item j by user i as follows:

$$ r_{ij} \sim \mathcal{N}\bigl(v_i\, g_r(\delta_{ij} + \eta_j),\; c_{ij}^{-1}\bigr) \tag{1} $$

where $\eta_j \sim \mathcal{N}(0, \lambda_\eta^{-1})$ is the fitness (or interestingness) of item j, which represents
the probability of adoption given that the user viewed it. The precision
parameter $c_{ij}$ serves as confidence for adoption $r_{ij}$, $v_i$ represents the visibility
of an item to user i, and $\delta_{ij}$ represents user i’s interest in item j. We define $g_r$ as a
linear function for simplicity. One of the key properties of Vip lies in how a user
adopts items that have the same visibility. We assume that the user adopts
items that are either relevant to her or interesting in general ($\eta_j$).

Users discover items by browsing through their message stream. As argued 
above, the position of an item in the stream determines its visibility, the likelihood
that it is viewed. However, an item’s exact position is often not known. Instead,
we estimate its average visibility from the available data. This quantity depends
on the user’s information load, i.e., the flow of messages to the user’s stream,
and the frequency with which the user visits the site. The greater the number of new messages
the user receives between visits to the site, the less likely the user is to view
any specific item. Following [12], we estimate the visibility of an item to user i as:


$$ v_i \sim \sum_{L} G\!\left(\frac{1}{1+\rho_i},\, L\right)\bigl(1 - IG(\mu, \lambda, L)\bigr) \tag{2} $$


The first factor gives the probability that L newer messages have accumulated
in user i’s stream since the arrival of a given item. The accumulation of items is
a competition between the rate at which friends post new messages to the user’s stream
and the rate at which the user visits the stream to read the messages. The ratio $\rho_i$ of these
rates gives the expected number of new messages in user i’s stream since item
j’s arrival. Taking friends’ activity and user activity each to be a Poisson process,
the competition gives rise to a geometric distribution with success probability
$p = 1/(1 + \rho_i)$: $G(p, L) = (1 - p)^L p$. We will revisit how we estimate $\rho_i$ in
Section 3.1. The second factor of Eq. 2 gives the probability that user i will navigate
to at least the (L+1)-st position in her stream to view the item. This is given by the
upper cumulative distribution of an inverse Gaussian IG with mean $\mu$, shape
parameter $\lambda$, and variance $\mu^3/\lambda$:



( 3 ) 


This distribution has been used to describe the “law of surfing” [13], and it represents
the probability the user will view L items on a web page before stopping.
Therefore, the upper cumulative distribution of IG gives the probability the user will
view at least L items, hence navigating to the (L+1)-st position in her stream.
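
The visibility estimate of Eq. 2 can be evaluated numerically. The following is a minimal sketch, not the authors' implementation: it assumes the sum over L can be truncated at a hypothetical `max_L`, and it uses the standard closed-form cumulative distribution of the inverse Gaussian. The function names are illustrative.

```python
import math


def geometric_pmf(rho, L):
    """Probability that L newer messages accumulated, with p = 1 / (1 + rho)."""
    p = 1.0 / (1.0 + rho)
    return (1.0 - p) ** L * p


def _phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def inverse_gaussian_cdf(x, mu, lam):
    """Closed-form CDF of an inverse Gaussian with mean mu and shape lam."""
    if x <= 0:
        return 0.0
    a = math.sqrt(lam / x)
    value = _phi(a * (x / mu - 1.0)) + math.exp(2.0 * lam / mu) * _phi(-a * (x / mu + 1.0))
    return min(1.0, value)  # guard against rounding slightly above 1


def visibility(rho, mu=14.0, lam=14.0, max_L=2000):
    """Truncated version of Eq. 2: sum over L of G(1/(1+rho), L) * (1 - IG_cdf(L)).

    max_L is an assumption of this sketch; the survival term is negligible
    far beyond the mean mu, so the truncated tail contributes almost nothing.
    """
    return sum(
        geometric_pmf(rho, L) * (1.0 - inverse_gaussian_cdf(L, mu, lam))
        for L in range(max_L)
    )
```

Under these assumptions, a user with no backlog (`rho = 0`) has visibility 1, and visibility decreases as the expected number of newer messages grows.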

We calculate the personal relevance of item j to user i as:

$$ \delta_{ij} = g_\delta(u_i^T \theta_j) \tag{4} $$

where the symbol T refers to the transpose operation, $u_i$ represents the topic profile of
user i, $\theta_j$ represents the topic profile of item j, and $g_\delta$ is a linear function for simplicity.
We represent topic profiles of users and items in a shared low-dimensional space
as follows.


$$ u_i \sim \mathcal{N}(0, \lambda_u^{-1} I_K), \qquad \theta_j \sim \mathcal{N}(0, \lambda_\theta^{-1} I_K) \tag{5} $$


where K is the number of topics. Note that if we only use personal relevance (5) 
and ignore visibility and fitness, the Vip model reduces to the probabilistic matrix factorization
(PMF) model [55] that learns latent topics from user-item adoptions.

The generative process for item adoption through a social stream can be 
formalized as follows: 

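For each user i
    Generate $u_i \sim \mathcal{N}(0, \lambda_u^{-1} I_K)$
    Generate $v_i \sim \sum_L G(1/(1+\rho_i), L)\,(1 - IG(\mu, \lambda, L))$
For each item j
    Generate $\theta_j \sim \mathcal{N}(0, \lambda_\theta^{-1} I_K)$
    Generate $\eta_j \sim \mathcal{N}(0, \lambda_\eta^{-1})$
For each user i
    For each recommended item j from friends
        Generate the adoption $r_{ij} \sim \mathcal{N}(v_i\, g_r(\delta_{ij} + \eta_j),\; c_{ij}^{-1})$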

Here $\delta_{ij} = u_i^T \theta_j$, $\lambda_u = \sigma_r^2/\sigma_u^2$, $\lambda_\theta = \sigma_r^2/\sigma_\theta^2$ and $\lambda_\eta = \sigma_r^2/\sigma_\eta^2$. Lack of adoption by
user i of item j ($r_{ij} = 0$) can be interpreted in two ways: either the user saw the item
but did not like it, or the user did not see the item but may have liked it had she seen
it. While other models partly account for lack of knowledge about non-adoptions
using smoothing [53], we explicitly model the visibility of items to users. We set $c_{ij}$
to a high value $a_r$ when $r_{ij} = 1$; when $r_{ij} = 0$, we set it to a low value $b_r$ for items
recommended by friends and $c_r$ for the rest ($a_r > b_r > c_r > 0$). In this paper, we
use the confidence parameter values $a_r = 1.0$, $b_r = 0.03$ and $c_r = 0.01$ for $c_{ij}$.
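
The confidence assignment and the final step of the generative process can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: $g_r$ is taken to be the identity (the text only says it is linear), and the function names `confidence` and `sample_adoption` are hypothetical.

```python
import random

# Confidence values reported in the text: a_r > b_r > c_r > 0.
A_R, B_R, C_R = 1.0, 0.03, 0.01


def confidence(adopted, recommended_by_friend):
    """Return c_ij: high for adoptions, lower for the two kinds of non-adoption."""
    if adopted:
        return A_R
    return B_R if recommended_by_friend else C_R


def sample_adoption(v_i, u_i, theta_j, eta_j, c_ij, rng=random):
    """Draw r_ij ~ N(v_i * (u_i . theta_j + eta_j), 1 / c_ij).

    Assumes g_r is the identity function; c_ij is a precision, so the
    standard deviation passed to gauss() is sqrt(1 / c_ij).
    """
    delta_ij = sum(a * b for a, b in zip(u_i, theta_j))
    mean = v_i * (delta_ij + eta_j)
    return rng.gauss(mean, (1.0 / c_ij) ** 0.5)
```

A low confidence value corresponds to a wide Gaussian, so an observed non-adoption constrains the learned parameters only weakly.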

2.1 Learning Parameters 

To learn model parameters, we follow the approaches of [55, 13] and develop
coordinate ascent, an EM-style algorithm, to iteratively optimize the variables
$\{u_i, \theta_j, \eta_j\}$ and calculate the maximum a posteriori (MAP) estimates. MAP estimation
is equivalent to maximizing the complete log likelihood ($\mathcal{L}$) of $U$, $V$, $\theta$, $\eta$ and $R$
given $\lambda_u$, $\lambda_\theta$, $\lambda_\eta$, $\mu$, $\lambda$ and $\rho$.


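$$ \mathcal{L} = -\frac{\lambda_u}{2} \sum_{i=1}^{N} u_i^T u_i
  + \sum_{i=1}^{N} \log \sum_{L} G\!\left(\frac{1}{1+\rho_i},\, L\right)\bigl(1 - IG(\mu, \lambda, L)\bigr)
  - \frac{\lambda_\theta}{2} \sum_{j=1}^{M} \theta_j^T \theta_j
  - \frac{\lambda_\eta}{2} \sum_{j=1}^{M} \eta_j^2
  - \sum_{i=1}^{N} \sum_{j=1}^{M} \frac{c_{ij}}{2} \bigl(r_{ij} - v_i (u_i^T \theta_j + \eta_j)\bigr)^2 \tag{6} $$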
Given a current estimate, we take the gradient of $\mathcal{L}$ with respect to $u_i$, $\theta_j$, and
$\eta_j$ and set it to zero. The update equations are:

$$ u_i \leftarrow (\lambda_u I_K + \theta v_i C_i v_i \theta^T)^{-1}\, \theta C_i (v_i R_i - v_i \eta v_i) $$

$$ \theta_j \leftarrow (\lambda_\theta I_K + U V C_j V U^T)^{-1}\, U C_j (V R_j - \eta_j V V \mathbf{1}_N) $$

$$ \eta_j \leftarrow (\lambda_\eta + v^T C_j v)^{-1}\, v^T C_j (R_j - V U^T \theta_j) $$

where $C_j$ is a diagonal matrix of confidence parameters $c_{ij}$. Item visibility to
user i, $v_i$, is represented as a diagonal matrix $V$ or in vector format as $v$. We
define $\theta$ as a K x M matrix, $U$ as a K x N matrix and $R_j$ as a vector with values
for all users i for the given item j.
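
Because $\eta_j$ is a scalar, its update reduces to a ratio of two sums over users. The sketch below writes it out in plain Python for illustration; it assumes a linear $g_r$ equal to the identity and, unlike the K x N matrix in the text, stores `U` as a list of per-user topic vectors. The function name is hypothetical.

```python
def update_fitness(r_j, v, c_j, U, theta_j, lam_eta):
    """Coordinate-ascent update for the fitness of one item j.

    Implements eta_j <- (lam_eta + v^T C_j v)^{-1} v^T C_j (R_j - V U^T theta_j),
    where r_j, v and c_j hold one entry per user and U holds one
    K-dimensional topic vector per user.
    """
    numerator = 0.0
    denominator = lam_eta
    for r_ij, v_i, c_ij, u_i in zip(r_j, v, c_j, U):
        delta_ij = sum(a * b for a, b in zip(u_i, theta_j))
        # Residual adoption not explained by visibility times relevance.
        numerator += v_i * c_ij * (r_ij - v_i * delta_ij)
        denominator += c_ij * v_i * v_i
    return numerator / denominator
```

With `lam_eta = 0` and a single user, the update returns the fitness that makes the predicted adoption $v_i(\delta_{ij} + \eta_j)$ match the observed $r_{ij}$ exactly; a larger `lam_eta` shrinks the estimate toward zero.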

2.2 Prediction 

After parameters are learned, Vip can be used to predict item adoptions by a 
user. For user-item adoption prediction, user i’s adoption of item j retweeted by 
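a friend is obtained by point estimation with optimal variables $\{\theta^*, u^*, v^*, \eta^*\}$:

$$ E[r_{ij} \mid \mathcal{D}] \approx E[v_i \mid \mathcal{D}]^T \bigl(E[\delta_{ij} \mid \mathcal{D}] + E[\eta_j \mid \mathcal{D}]\bigr) $$

$$ r^*_{ij} \approx v^*_i \bigl((u^*_i)^T \theta^*_j + \eta^*_j\bigr) $$

where $\mathcal{D}$ is the training data. The adoption probability is decided by user visibility
$v^*$, user topic profile $u^*$, item topic profile $\theta^*$, and item fitness $\eta^*$.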
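
The point estimate above amounts to a dot product plus a fitness offset, scaled by the user's visibility. A minimal sketch follows, assuming the learned parameters are available as plain Python lists and floats; `predict_adoption` and `rank_items` are illustrative names, not part of any published code.

```python
def predict_adoption(v_i, u_i, theta_j, eta_j):
    """Point estimate r*_ij = v_i * (u_i . theta_j + eta_j)."""
    delta_ij = sum(a * b for a, b in zip(u_i, theta_j))
    return v_i * (delta_ij + eta_j)


def rank_items(v_i, u_i, items):
    """Rank candidate items for one user by predicted adoption, best first.

    items maps an item id to a (theta_j, eta_j) pair.
    """
    scores = {
        item_id: predict_adoption(v_i, u_i, theta_j, eta_j)
        for item_id, (theta_j, eta_j) in items.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

For example, `rank_items(0.5, [1.0, 0.2], {"a": ([1.0, 0.0], 0.0), "b": ([0.0, 1.0], 0.5)})` places item `"a"` first, because its relevance to the user outweighs the higher fitness of item `"b"`.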

3 Evaluation 

In this section we demonstrate the utility of the Vip model by applying it
to data from Twitter and evaluating its performance on
prediction tasks. We collected tweets containing a URL to monitor information
spread over the social network from Nov 2011 to Jul 2012. We started by monitoring
potential seed URLs from the streaming APIs and then collected their entire sharing
history using the Twitter REST APIs. This yielded 12.5M
tweets with 9.5M users.


3.1 Model Selection 

First, we study how parameters of Vip affect the overall performance of user-item
adoption prediction using recall@3. We use the same “law of surfing”
parameters, $\mu = 14.0$ and $\lambda = 14.0$, as were used in a previous study of Twitter
and another social media site. The expected number of new posts including a
URL that user i received, $\rho_i$, is computed as
$rate_i^{(posts\ received)} / rate_i^{(visits)}$. The
rate $rate_i^{(posts\ received)}$ is proportional to the number of friends ($N_{frd}(i)$) that user i
follows and their average posting frequency. To estimate the posting frequency of
all users, we would have to track all their behaviors. Instead of tracking all users,
we estimate it using the typical URL posting rates of users from our data:
$rate_i^{(posts\ received)} = 1.4 \cdot N_{frd}(i)$. User i visits Twitter at a rate $rate_i^{(visits)}$.
This number is not available; however, we expect it to be proportional to the
number we do observe: the number of posts of user i ($N_{posts}(i)$). [12] estimated
that the average number of visits per post was 38 for Twitter users. Also, since
around 20% of tweets include a URL [5], the rate for user i becomes
$rate_i^{(visits)} = 7.6 \cdot N_{posts}(i)$.
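
The information load estimate described above is simple arithmetic on two observable counts. The sketch below reproduces it with the constants quoted in the text (1.4 URL posts per friend, 38 visits per post, 20% of tweets containing a URL); the function name and the error handling for users with no posts are assumptions of this example.

```python
URL_POSTS_PER_FRIEND = 1.4  # typical rate of URL posts received per friend
VISITS_PER_POST = 38        # average number of visits per post
URL_FRACTION = 0.20         # share of tweets that include a URL


def information_load(n_friends, n_posts):
    """Estimate rho_i as rate(posts received) / rate(visits)."""
    rate_received = URL_POSTS_PER_FRIEND * n_friends
    rate_visits = VISITS_PER_POST * URL_FRACTION * n_posts  # 7.6 * n_posts
    if rate_visits == 0:
        raise ValueError("visit rate is zero; rho_i is undefined for this user")
    return rate_received / rate_visits
```

Under these constants, a user who follows 76 friends and has posted 14 times has an information load of about 1, i.e., roughly one new URL post arrives per visit.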

For the PMF model, we vary the parameters $K \in \{10, \ldots, 200\}$, and $\lambda_u$ and
$\lambda_\theta \in \{10^{-4}, \ldots, 10^{4}\}$, using grid search on validation recommendations. Throughout
this paper, we set $K = 30$, $\lambda_u = 10^{-3}$, and $\lambda_\theta = 10^{-3}$, the values that performed best
for PMF, for both PMF and Vip. For the fitness parameter, we vary
$\lambda_\eta \in \{10^{-4}, \ldots, 10^{4}\}$ while fixing $\lambda_\theta = 10^{-3}$ and $\lambda_u = 10^{-3}$, and set $\lambda_\eta = 10^{4}$.

3.2 User-Item Adoption Prediction 

In the prediction task, we sort the items by $r_{ij}$, the probability of adoption by
user i, and calculate the fraction of the X top-ranked items that the user actually
adopted. A user may not adopt an item either because she did not see
it or because she does not like it. This makes it difficult to use precision to
evaluate prediction results. Instead, we use recall@X (= N(items in top X that the user
adopted) / N(items the user adopted)) to measure the model’s performance on the prediction
task. To summarize performance of the prediction algorithm, we average
recall values over all users. 
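
The recall@X measure defined above can be computed directly from predicted scores. The following sketch assumes scores are held in a dictionary per user; the function names and the choice to raise an error for users without adoptions are illustrative.

```python
def recall_at_x(scores, adopted, x):
    """recall@X = N(items in top X that the user adopted) / N(items the user adopted).

    scores maps item id -> predicted adoption r_ij; adopted is the set of
    items the user actually adopted in the test set.
    """
    if not adopted:
        raise ValueError("recall is undefined for a user with no adopted items")
    top_x = sorted(scores, key=scores.get, reverse=True)[:x]
    hits = sum(1 for item in top_x if item in adopted)
    return hits / len(adopted)


def mean_recall_at_x(per_user, x):
    """Average recall@X over users; per_user is a list of (scores, adopted) pairs."""
    values = [recall_at_x(scores, adopted, x) for scores, adopted in per_user]
    return sum(values) / len(values)
```

For a user with scores `{"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.05}` who adopted `{"a", "c"}`, recall@1 is 0.5 and recall@3 is 1.0.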

We divide each user’s adopted items into five folds and construct the training 
set and the test set. We use five-fold cross validation and compare performance 
of Vip to three baseline models: Random, Fitness and Relevance. The Random
baseline chooses items at random from among the items in user i’s stream,
i.e., items adopted by i’s friends. The Fitness baseline uses item fitness values
($\eta$) learned by Vip to recommend the X highest-fitness items. The Relevance
baseline bases its recommendations on user-topic and item-topic vectors learned
by PMF to recommend X most relevant items. 

Figure 2(a) shows the models’ overall performance on the user-item adoption
prediction task when we vary X, the number of recommendations made by each
model. Note that a better model should provide higher recall@X for different
X. Vip outperforms all baselines, but the improvement is especially dramatic
when the number of recommended items is small: recall@3 was 0.30, 0.17, 0.16,
and 0.12 for the Vip, Relevance, Fitness, and Random models, respectively. Note
that as the number of recommendations (X) increases, the recall of all models
improves, however, at the expense of precision.

Figure 2(b) shows how prediction performance on the user-item recommendation
task varies with user activity level, that is, how the number of items
adopted by the user in the training set affects recall@3 on the test set. The performance
of the Random baseline, which recommends three randomly chosen
items from the user’s stream, does not vary with user activity level, as expected.
Similarly, the Fitness baseline does not vary significantly with activity, since it depends
only on the propensity of the item to spread. Both Vip and Relevance
improve with increasing user activity as they can learn better user-topic profiles
with more training data. Note that for low activity users, whose interests are
not well-known, recommending items based on personal Relevance performs
about the same as picking items based on their fitness, but as more can be
learned about the user’s preferences, Relevance outperforms picking items based
on their fitness or picking them randomly from the user’s stream. Vip handily
outperforms baselines over all user activity levels. This shows that accounting
for visibility dramatically improves predictability of user item adoptions in social
media compared to using personal relevance or item fitness alone.

Fig. 2. Error bars indicate standard deviation. (a) Recall of user-item adoption
prediction with different numbers of recommended items. The number of topics was
fixed at 30. (b) Average recall@3 of user-item adoption prediction for different
activity levels of users with 30 topics. (c) Cascade size vs expected values of item
fitness plus personal relevance E(I+P) for all adopters. The size and color of each
circle represents the expected value of that item’s visibility.

4 Visibility vs Item Fitness vs Personal Relevance 

We analyze URL cascades on Twitter by examining how the three factors learned 
by the Vip model contribute to their success. Depending on the characteristics 
of the community that the URL has reached, fitness can vary. In our data set, 
URLs that have been retweeted within a community sharing a specific hobby or
interest tend to have high fitness values. Since members share common topic
preferences, items received relatively high adoption rates per exposure, which
also often translates into quick adoption. High fitness means a high adoption rate
per view, and fitness shows a statistically significant correlation of 0.85 with cascade size. However,
4% of the URLs have fitness values that are negatively correlated with cascade 
size and 40% of the URLs show no correlation between fitness and cascade size. 
Apparently fitness by itself cannot explain the spread of information, and other 
factors, such as visibility and personal relevance also have to be considered. 

We separate the effect of item quality from its visibility to the user. We 
define quality very loosely as the combined effect of its fitness and relevance to 
adopters, and measure it by the expected value of these variables. This definition 
aims to make quality specific to the item itself, and separate from the details of 
how users may discover it. Figure 2(c) shows how the size of cascades in our data
set depends on item quality and visibility. Each circle represents a URL, with
its color encoding the expected visibility of the URL. Not surprisingly, higher
quality URLs have larger cascades. More interestingly, some of the variance of
cascade size can be explained by visibility: for URLs of similar quality, the more
visible URLs spread more widely. In other words, for items cascading through a
network where users have similar topic preferences, the total size of the cascade
is decided by their visibility.

Table 1. Descriptions, cascade sizes, and expected values of visibility E(V), item fitness E(I), and personal relevance E(P) for Youtube video URLs

Description                                              Cascade Size   E(V)    E(I)    E(P)
Strongbow surfers Neon Night Surfing on Bondi Beach      84             49      -0.04   85.1
Jay-Z Music Video                                        141            50.5    -0.04   130.7
Parallels for Mac for Chrome OS and Windows 8            68             52.3    -0.05   71.6
Bahraini Activist Nabil Rajab                            116            62.1    -0.03   127.6
Ellen’s Swaggin’ Wagon - a marriage Proposal             87             65.6    -0.13   80.2
UNICEF - Making headway toward an AIDS-free generation   102            71.7    -0.11   100.6
Whitney Houston Video                                    120            73.5    -0.13   127.9
Paul McCartney’s message from Moscow                     109            87      -0.22   102.2
Ian Somerhalder Foundation                               143            98.6    -0.24   150.8

Next we illustrate the contributions of the three factors using specific case 
studies. There were 205 URLs to Youtube videos in our data set, with examples
shown in Table 1. Two of the most popular URLs in our data set were “Jay-Z
Music Video” and “Ian Somerhalder Foundation”, which were both adopted
more than 140 times through friends’ recommendations. The fitness of “Jay-Z
Music Video” is six times higher than that of “Ian Somerhalder Foundation”,
but it has about half the expected visibility. Therefore, the high fitness value of “Jay-Z
Music Video” makes up for the relatively low visibility, and it reaches a similar
number of adoptions as “Ian Somerhalder Foundation”.


5 Conclusion 

In this paper, we proposed Vip, a model that captures the mechanisms of information
spread in social media. Vip can recommend items to users based on
how easily users find an item in their stream, how well the item aligns with their 
interests, and the item’s propensity to be adopted upon exposure. Prediction 
is surprisingly accurate, considering the crude estimates of visibility. Knowing 
visibility more accurately will further improve prediction performance. We plan 
to extend our model to take into account descriptions of items. We will further 
study the role of network structure and the relationship between visibility, item 
fitness, and personal relevance on information sharing. 